Tuning multi-user Smalltalk
Jay Almarade


An important activity before delivering any application is
tuning the code to meet performance requirements. In
single-user Smalltalks on the client machine, this activity
typically involves using profiling tools to identify the
methods where most time is spent. Once these methods are
identified, several options are available, such as
implementing the methods as primitives in C code, caching
calculated values that are used repeatedly, or, perhaps most
important, producing a better design. This tuning activity
might also involve analyzing the memory usage of the
application, reducing the memory footprint of the application
while it is running, minimizing the number of temporary
objects that are created and then quickly garbage collected,
and exercising more explicit control over garbage collection
(especially in real-time systems).

These same tuning activities are applicable to multi-user,
server Smalltalk as well. In addition, because server
Smalltalk must accommodate concurrent transactions by many
hundreds of users, and must handle many millions of objects
being created and retrieved, there are additional ways in
which applications can be tuned. In this column I discuss
some of the techniques to tune multi-user, server Smalltalk
applications.

A key component in tuning a large-scale, multi-user Smalltalk
application is understanding and controlling the placement of
objects on disk. Because the number and size of objects may
prevent all that are being used in an application from being
present in RAM at once, the proximity of objects on disk may
affect application performance. Obviously, the fewer disk
pages accessed during the normal course of application
execution, the better the performance. To tune the placement
of objects on disk, server Smalltalk must allow developers to
cluster objects that are frequently accessed together. In
GemStone Smalltalk, objects are placed on disk based on their
access patterns by default. More specifically, objects that
are created or modified within the same transaction tend to
be placed close together. In many cases, this default
placement is sufficient.

However, GemStone Smalltalk does provide additional protocol
to allow developers to discover where objects are placed and
to move them closer together for more efficient access.

The first step in tuning an application's performance for
accessing objects is to understand the reading and writing
characteristics of the application while it is running. In
GemStone Smalltalk, you can send the messages "pageReads" or
"pageWrites" to class System to get the cumulative number of
pages that were read or written since the session began
(i.e., since you logged into the server). Typically, it is
useful to measure the number of pages read immediately before
and immediately after an extensive calculation or query to
determine if clustering objects together might be of benefit.
For example, the following code returns the number of pages
that were read to execute the given query.

| initialNumberOfReads |
initialNumberOfReads := System pageReads.
SetOfPersons select: [ :person |
    "find each person younger than their spouse"
    person isMarried and: [ person spouse age > person age ] ].
^ System pageReads - initialNumberOfReads

Pages are written to disk for two reasons: first, when
internal buffers become full and must make room for new
objects to be created; second, when the transaction is
committed. Measuring the number of pages written at various
times during the life of a transaction can help determine
whether buffer sizes need to be increased, whereas measuring
the number of pages written just before and after a
transaction is committed may help determine whether more
explicit control over clustering would help.
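
As a rough sketch of the second measurement, the fragment
below counts the pages written by a commit. It assumes the
session holds uncommitted changes, and it assumes that
"System commitTransaction" is the message used to commit,
which is not named in the text above.

```smalltalk
| writesBeforeCommit |
"Record the cumulative page-write count just before committing."
writesBeforeCommit := System pageWrites.
"Assumption: commitTransaction commits the current transaction."
System commitTransaction.
"Answer the number of pages written by the commit itself."
^ System pageWrites - writesBeforeCommit
```

A large result for a transaction that touched only a few
related objects is a hint that those objects are spread over
many pages.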

Clustering related objects together solves a specific
problem: poor performance because of too much disk activity.
One way to check how objects are clustered is to determine
which pages the objects are stored on. You can send the
message "page" to any object to get an integer identifying
the disk page on which the receiver resides. This integer is
a logical identifier of the page, not a pointer to a storage
location.
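
One possible use of this message is to count how many
distinct pages the elements of a collection occupy. The
sketch below assumes a collection named TheSetOfEmployees
(a hypothetical name here) whose elements have already been
committed to disk.

```smalltalk
| pages |
"Collect the logical page identifier of each element in a Set,
 so that elements sharing a page are counted only once."
pages := Set new.
TheSetOfEmployees do: [ :anEmp | pages add: anEmp page ].
"Answer the number of distinct pages the elements reside on."
^ pages size
```

A count that is small relative to the number of elements
suggests the elements are already stored close together.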

Objects are stored persistently in structures called extents.
An extent is a disk file or raw partition on disk. The
repository of all objects can be maintained in multiple
extents, possibly distributed among several disk drives on
several machines. In GemStone Smalltalk, there is a single
object, named SystemRepository, that is an instance of class
Repository. In addition to defining protocol to perform
online backups and restores, to dynamically add new extents,
and to create replicates of extents for purposes of fault
tolerance, class Repository also has methods to provide
information about the extent in which a page is located and
the file name for a given extent. The next example shows how
one can determine the file name where an object is actually
stored on disk.

| extentId |
"MyObject is the object whose location we are interested in."
extentId := SystemRepository extentForPage: MyObject page.
^ SystemRepository fileNames at: extentId.

Once you have analyzed the reading and writing behavior of
your application for excessive disk activity and determined
the number and location of pages where objects reside, you
may be able to improve performance by explicitly controlling
how objects are clustered together. Conceptually, you can
think of objects as being written to disk on a stream of disk
pages. When a page is filled, another page is chosen and
objects are written to the new page. The stream of pages used
for writing is called a "bucket". GemStone Smalltalk provides
the class ClusterBucket to give programmers control over the
stream of pages to which objects are written. Every object is
associated with an instance of ClusterBucket, and all objects
assigned to the same ClusterBucket will be clustered
together. When objects with the same ClusterBucket are
written to disk, they are written to contiguous locations on
the same page, if they will fit, or to contiguous locations
on several pages if not.

A ClusterBucket can be associated with a specific extent.
Each ClusterBucket has an instance variable extentId that
specifies the file to which the stream of pages will be
written. You can find out what extents are available by
executing the expression SystemRepository fileSizeReport.
This returns a string that describes the extent identifier,
file name, file size, and space available for each available
extent. For example, the expression
aClusterBucket extentId: 3 sets the extent for an existing
ClusterBucket.

You can create a new ClusterBucket by executing the
expression ClusterBucket newForExtent: 4. Initially, there
are seven existing instances of ClusterBucket maintained in a
global array named AllClusterBuckets. Some of these are
available for application developers, whereas others are used
to cluster system objects, such as kernel methods or source
code strings. When new instances of ClusterBucket are
created, they are added to this global array, and a
ClusterBucket's position in this array is known as its
clusterId. This provides a way to reference any ClusterBucket
that exists through its clusterId, for example, by performing
the expression ClusterBucket bucketWithId: 7.

[Figure 1. Employee schema.]
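
Because a bucket's clusterId is its position in the global
array, one way to learn the id of a bucket you have just
created is to ask the array for its size. The sketch below
assumes that new buckets are appended at the end of
AllClusterBuckets and that no other bucket is created in
between. The extent number 1 is only a placeholder, and the
variable names are hypothetical.

```smalltalk
| empBucket empId addressBucket addressId |
"Assumption: extent 1 exists; check SystemRepository fileSizeReport."
empBucket := ClusterBucket newForExtent: 1.
"Assumption: the new bucket occupies the last position in the array."
empId := AllClusterBuckets size.
addressBucket := ClusterBucket newForExtent: 1.
addressId := AllClusterBuckets size.
"Each bucket can later be retrieved with ClusterBucket bucketWithId:."
^ Array with: empId with: addressId
```

Under these assumptions, starting from the seven initial
buckets, the two new buckets would receive the ids 8 and 9.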

To specify the ClusterBucket for a particular object, you can
send the message "clusterInBucket: aClusterBucket". This will
not immediately write the object to disk but indicates that
when it is next written, the stream of pages in which it is
written will be determined by the given ClusterBucket.

If you want to write the object to disk immediately, you can
send the message "moveToDiskInBucket: aClusterBucket".
Sending the "clusterBucket" message to an object will return
the ClusterBucket to which the receiver is currently
assigned. GemStone Smalltalk provides some convenience
methods to help cluster objects. You can send the message
"cluster" to any object to assign it to the current default
ClusterBucket. You can use this message to build specialized
clustering behaviors for your application classes. One such
method already provided is clusterDepthFirst, which traverses
the named and indexable instance variables of the receiver,
sending the "cluster" message to each object. The cluster
method returns a boolean indicating whether the receiver has
already been clustered during the current transaction. This
is used to prevent infinite recursion. There are also
convenience methods defined in class Behavior to cluster
classes and related objects. The clusterBehavior method
clusters a class and its method dictionary. The
clusterDescription method clusters the objects that describe
the structure of a class, such as its instVarNames array,
class variables, instance variable constraints, and class
history.
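
A specialized clustering method built on "cluster" might look
like the following sketch. The selector clusterWithName is
hypothetical, as is the assumption that Employee responds to
ssn and name and that a name responds to first, middle, and
last. It uses the boolean returned by "cluster" as the
recursion guard described above.

```smalltalk
"Hypothetical instance method of class Employee."
clusterWithName
    "Cluster the receiver, its ssn, and its name components in the
     current default ClusterBucket. Answer true if the receiver had
     already been clustered during this transaction."
    | name |
    "cluster answers true when the receiver was already clustered."
    self cluster ifTrue: [ ^ true ].
    self ssn cluster.
    name := self name.
    name cluster.
    name first cluster.
    name middle cluster.
    name last cluster.
    ^ false
```

Sending clusterWithName to every element of a collection
within one transaction would then place each employee next
to the components that are read along with it.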

To illustrate how to control object clustering, imagine 
a set of Employee objects based on the simplified schema 
illustrated in Figure 1. 

Suppose most applications that access an instance of Employee
also access its name and ssn, so we would like to cluster
instances of Employee with their corresponding Name and ssn
String objects.

The addresses of employees are accessed less frequently and
are typically accessed for all employees at once, so we would
like to cluster all Address objects together. The following
code shows how we can cluster these objects so that employees
and their frequently accessed subcomponents are stored
contiguously and employee addresses are grouped together
separately.

| empCluster addressCluster |
"get the bucket previously created for Employees"
empCluster := ClusterBucket bucketWithId: 8.
"get the bucket previously created for Addresses"
addressCluster := ClusterBucket bucketWithId: 9.
TheSetOfEmployees do: [ :anEmp | | name address |
    anEmp clusterInBucket: empCluster.
    anEmp ssn clusterInBucket: empCluster.
    "cluster the name and its components"
    name := anEmp name.
    name clusterInBucket: empCluster.
    name first clusterInBucket: empCluster.
    name middle clusterInBucket: empCluster.
    name last clusterInBucket: empCluster.
    "cluster the address and its components"
    address := anEmp address.
    address clusterInBucket: addressCluster.
    address street clusterInBucket: addressCluster.
    address city clusterInBucket: addressCluster.
    address state clusterInBucket: addressCluster.
    address zip clusterInBucket: addressCluster ].

March-April 1996
http://www.sigs.com

This column has described how to determine if clustering
objects might help application performance and how to cluster
objects using ClusterBuckets. My next column will discuss how
to measure overall system performance and steps for tuning
multi-user Smalltalk for higher transaction throughput.


THE BEST OF COMP.LANG.SMALLTALK

continued from page 23

• Avoid commitment: This is another way of expressing the
  principle of postponing decisions but one that might strike
  a chord with younger or unmarried programmers.

• It's not a good example if it doesn't work: This one comes
  from David Buck (dbuck@magmacom.com), who's fed up with
  looking at example and test methods that haven't been
  properly maintained as the code evolved. I can't think of a
  way to apply this to life but it's good advice anyway.

• Steal everything you can from your parents: A principle for
  those trying to make effective use of inheritance or moving
  into their first apartment.

• Cover your a**: Like in a bureaucracy, the most important
  thing is to make sure that it isn't your fault. Make sure
  your code won't have a problem even if things are going
  very wrong elsewhere.


SEQUENTIAL KEY ALLOCATION

continued from page 26

        ifTrue: [ keyCache := self nextKeys: self keyCacheSize ].
    key := keyCache first.
    keyCache removeFirst.
    ^key.

If you choose to make the array optimization in the nextKeys:
method, this method must be changed to insert nil values into
the array as each key gets returned rather than using the
removeFirst selector.